A Transitive Model for Extracting Translation Equivalents of Web Queries through Anchor Text Mining

Authors

Wen-Hsiang Lu

Lee-Feng Chien

Hsi-Jian Lee

Abstract

One persistent difficulty in cross-language information retrieval (CLIR) and Web search is the lack of appropriate translations for new terminology and proper names. In previous research, departing from conventional approaches, we developed a method that exploits Web anchor texts as live bilingual corpora to ease query term translation. Although Web anchor texts are valuable multilingual, wide-scoped hypertext resources, not every language pair has enough anchor texts on the Web to extract the corresponding translations. To generalize the method, in this paper we extend our previous approach with a phase of transitive (indirect) translation via an intermediate (third) language, and we propose a transitive model that further exploits anchor-text mining for term translation extraction. Preliminary experimental results show that many query translations that cannot be obtained with the previous approach can be extracted with the improved approach.
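The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea of transitive (indirect) translation, not the authors' method. It assumes that candidate translations carry scores in [0, 1], that an indirect score is obtained by summing products of scores over intermediate-language terms, and that direct and indirect scores are mixed linearly. The function names, the `weight` parameter, and the toy terms and scores are all hypothetical.

```python
from collections import defaultdict


def transitive_scores(src_to_mid, mid_to_tgt):
    """Score target-language candidates through an intermediate language.

    src_to_mid: {intermediate_term: score} for one source query term.
    mid_to_tgt: {intermediate_term: {target_term: score}}.
    Assumption: the indirect score is the sum over intermediate terms
    of score(source -> mid) * score(mid -> target).
    """
    scores = defaultdict(float)
    for mid_term, s_sm in src_to_mid.items():
        for tgt_term, s_mt in mid_to_tgt.get(mid_term, {}).items():
            scores[tgt_term] += s_sm * s_mt
    return dict(scores)


def combine(direct, indirect, weight=0.5):
    """Linearly mix direct and indirect scores (hypothetical weighting)."""
    terms = set(direct) | set(indirect)
    return {
        t: weight * direct.get(t, 0.0) + (1.0 - weight) * indirect.get(t, 0.0)
        for t in terms
    }


# Toy, made-up scores: a source term with two intermediate-language candidates.
src_to_mid = {"mid_a": 0.5, "mid_b": 0.5}
mid_to_tgt = {
    "mid_a": {"tgt_x": 1.0},
    "mid_b": {"tgt_x": 0.5, "tgt_y": 0.5},
}

indirect = transitive_scores(src_to_mid, mid_to_tgt)
# No direct evidence exists for this pair, so only the indirect path contributes.
combined = combine({}, indirect, weight=0.5)
best = max(combined, key=combined.get)
```

In this toy setup the source and target languages share no direct evidence, so the candidate ranking comes entirely from the path through the intermediate language, which is the situation the abstract describes for language pairs with too few anchor texts.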

Similar resources

Towards Web Mining of Query Translations for Cross-Language Information Retrieval in Digital Libraries

This paper proposes an efficient client-server-based query translation approach that allows a more feasible implementation of cross-language information retrieval (CLIR) services in digital library (DL) systems. A centralized query translation server is constructed to process the translation requests of cross-lingual queries from connected DL systems. To extract translations not covered by standa...

LiveTrans: Translation Suggestion for Cross-Language Web Search from Web Anchor Texts and Search Results

In this paper we will present a system, called LiveTrans, which can generate translation suggestions for given user queries and provide an English-Chinese cross-language search service for the retrieval of both Web pages and images. The system effectively utilizes two kinds of Web resources: anchor texts and search results. The developed anchor-text-based and search-result-based methods are com...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Creating Multilingual Translation Lexicons with Regional Variations Using Web Corpora

The purpose of this paper is to automatically create multilingual translation lexicons with regional variations. We propose a transitive translation approach to determine translation variations across languages that have insufficient corpora for translation via the mining of bilingual search-result pages and clues of geographic information obtained from Web search engines. The experimental resu...

Mining Parenthetical Translations for Polish-English Lexica

Documents written in languages other than English sometimes include parenthetical English translations, usually for technical and scientific terminology. Techniques have been developed for extracting such translations (as well as transliterations) from large Chinese text corpora. This paper presents methods for mining parenthetical translations in Polish texts. The main difference between translati...

Publication date: 2002
